From Ukkonen to McCreight and Weiner: A Unifying View of Linear-Time Suffix Tree Construction
Abstract
We review the linear-time suffix tree constructions by Weiner, McCreight, and Ukkonen. We use the terminology of the most recent algorithm, Ukkonen's online construction, to explain its historic predecessors. This reveals relationships much closer than one would expect, since the three algorithms are based on rather different intuitive ideas. Moreover, it completely explains the differences between these algorithms in terms of simplicity, efficiency, and implementation complexity.

1 Motivation and Overview

Suffix trees provide most efficient solutions to a "myriad" [4] of string processing problems. The suffix tree for a string t really turns t inside out, immediately exposing properties like longest or most frequent subwords. The fundamental question whether w occurs in t can be answered in O(|w|) steps, independent of the length of t, once the suffix tree for t is constructed. Thus it is of great importance that the suffix tree for t can be constructed and represented in linear time and space.

In spite of their basic role for string processing, elementary books on algorithms and data structures barely mention suffix trees, and never give efficient algorithms for their construction [3, 21, 11, 1, 17, 7]. Recent exceptions are [22, 13]. The reason for this is historical: starting with the seminal paper by Weiner [26], suffix tree construction has built up a reputation of being overly complicated. The purpose of the present paper is to correct this reputation by working out what is essential about efficient suffix tree construction, and what is unnecessary complexity.
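The O(|w|) occurrence query mentioned above can be illustrated with a small sketch. It is not any of the three linear-time constructions: it builds an uncompacted suffix trie by inserting every suffix naively, which takes quadratic time and space. The function names `build_suffix_trie` and `occurs` are illustrative assumptions, not taken from the paper. The point is only that, once the structure exists, answering "does w occur in t?" walks at most |w| edges, regardless of the length of t.

```python
def build_suffix_trie(t):
    """Naive suffix trie as nested dicts: one root-to-node path per suffix of t.

    Illustrative only: O(|t|^2) time and space, unlike the linear-time
    suffix tree constructions of Weiner, McCreight, and Ukkonen.
    """
    root = {}
    for i in range(len(t)):
        node = root
        for ch in t[i:]:
            # Follow the existing child for ch, or create it if missing.
            node = node.setdefault(ch, {})
    return root


def occurs(trie, w):
    """Return True if w is a subword of the indexed string.

    Follows one edge per character of w, so the number of steps
    depends on |w| only, not on the length of the indexed string.
    """
    node = trie
    for ch in w:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("banana")
print(occurs(trie, "nan"))  # True
print(occurs(trie, "nab"))  # False
```

Every subword of t is a prefix of some suffix of t, which is why a walk from the root suffices. A real suffix tree compacts unary paths into single edges labeled by substrings, which is what brings the space down to linear.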
Similar resources
A Comparison of Imperative and Purely Functional Suffix Tree Constructions
We explore the design space of implementing suffix tree algorithms in the functional paradigm. We review the linear time and space algorithms of McCreight and Ukkonen. Based on a new terminology of nested suffixes and nested prefixes, we give a simpler and more declarative explanation of these algorithms than was previously known. We design two "naive" versions of these algorithms which are not line...
On-line Construction of Suffix Trees
An on-line algorithm is presented for constructing the suffix tree for a given string in time linear in the length of the string. The new algorithm has the desirable property of processing the string symbol by symbol from left to right. It always has the suffix tree for the scanned part of the string ready. The method is developed as a linear-time version of a very simple algorithm for (quadratic s...
Optimal Suffix Tree Construction with Large Alphabets
The suffix tree of a string is the fundamental data structure of combinatorial pattern matching. Weiner [Wei73], who introduced the data structure, gave an O(n) time algorithm for building the suffix tree of an n-character string drawn from a constant-size alphabet. In the comparison model, there is a trivial Ω(n log n) time lower bound based on sorting, and Weiner's algorithm matches this ...
Suffix Tree
SYNONYMS: Compact suffix trie
DEFINITION: The suffix tree S(y) of a non-empty string y of length n is a compact trie representing all the suffixes of the string. The suffix tree of y is defined by the following properties:
• All branches of S(y) are labeled by all suffixes of y.
• Edges of S(y) are labeled by strings.
• Internal nodes of S(y) have at least two children.
• Edges outgoing an intern...
Algorithms on Strings, Trees, and Sequences
Linear-Time Construction of Suffix Trees. We will present two methods for constructing suffix trees in detail, Ukkonen's method and Weiner's method. Weiner was the first to show that suffix trees can be built in linear time, and his method is presented both for its historical importance and for some different technical ideas that it contains. However, Ukkonen's method is equally fast and uses f...